COMPILING METHOD AND
STORAGE MEDIUM THEREFOR 

Background of Invention 

[0001] FIELD OF THE INVENTION

[0002] The present invention relates to a method for efficiently using a stack register for 
a CPU architecture that employs a register stack. 

[0003] BACKGROUND ART 



[0004] Certain architectures for the processors (CPUs) used in computers employ registers
as register stacks. For example, in the 64-bit architecture CPU IA-64, developed
jointly by Intel Corp. and Hewlett Packard Corp., one part of the general-purpose registers
serves as a register stack.

[0005] For a register stack, only those registers, from among the physical registers
prepared for a processor, that are required for the execution of procedures are
employed as logical registers (hereinafter referred to simply as registers when there
is no need to differentiate between physical registers and logical registers). For each
procedure, the number of registers employed is designated and allocated by an "alloc"
instruction, and when control returns to the calling source, these registers are
released. Within the procedure, the number of registers designated (hereinafter also
referred to as "allocated") by the alloc instruction is statically employed.

[0006] The registers in a register stack (stack registers) are employed, from the bottom to
the top, as "in" arguments, local variables and "out" arguments. When a procedure has
been called, an allocated "out" argument for the procedure is renamed, at the calling
destination, as an "in" argument.

[0007] The registers that are allocated are obtained in order by renaming a definite
number of physical registers. When at allocation time there is a shortage of physical
registers (a stack overflow), registers that are currently allocated are automatically
saved to memory and new physical registers are obtained. Further, when, at the time
control returns to the calling source, the registers allocated to the calling source
have been saved to memory (a stack underflow), the saved values are automatically
restored to the registers.
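
The overflow and underflow behavior described above can be illustrated with a small model. The following is a minimal sketch only, under stated assumptions: the class name RegisterStack, its fields, and the pool size of eight physical registers are hypothetical, and the model ignores the overlap between a caller's "out" arguments and the callee's "in" arguments.

```python
class RegisterStack:
    """Toy model of a register stack backed by a fixed pool of physical registers."""

    def __init__(self, num_physical):
        self.num_physical = num_physical
        self.frames = []     # frame size of each active procedure, caller first
        self.resident = []   # registers of each frame currently held in hardware
        self.spilled = 0     # registers saved to memory because of overflows
        self.filled = 0      # registers reloaded from memory because of underflows

    def call(self, frame_size):
        """Allocate a frame; on a shortage, save the oldest resident registers."""
        if frame_size > self.num_physical:
            raise ValueError("frame larger than the physical register pool")
        shortage = sum(self.resident) + frame_size - self.num_physical
        i = 0
        while shortage > 0:                      # stack overflow
            take = min(self.resident[i], shortage)
            self.resident[i] -= take
            self.spilled += take
            shortage -= take
            i += 1
        self.frames.append(frame_size)
        self.resident.append(frame_size)

    def ret(self):
        """Release the top frame; reload the caller's registers if they were saved."""
        self.frames.pop()
        self.resident.pop()
        if self.frames:                          # stack underflow when missing > 0
            missing = self.frames[-1] - self.resident[-1]
            self.filled += missing
            self.resident[-1] = self.frames[-1]


naive = RegisterStack(num_physical=8)
naive.call(6)    # the caller holds six registers
naive.call(6)    # the callee needs six more, so four caller registers are saved
naive.ret()      # on return, the four saved registers are reloaded
print(naive.spilled, naive.filled)    # 4 4
```

If the caller's frame held only three registers at the time of the call, a callee frame of five registers would fit in the same pool without any saving or reloading.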

[0008] Since a register stack is employed, the processing costs associated with the
saving and the restoring of registers can be avoided, so long as a stack overflow or
a stack underflow does not occur.

[0009] As is described above, when a register stack is employed in the CPU architecture,
the costs associated with the saving or the restoring of registers can be avoided, so
long as a stack overflow or a stack underflow does not occur.

[0010] Recently, now that CPUs that can execute instructions in parallel have become
available, various register allocation methods devised to take advantage of parallel
instruction processing have been proposed for inclusion in program compilation
techniques. According to these methods, to reduce the barriers to parallel execution
that may arise due to the occurrence of reverse dependencies (anti-dependencies) or
output dependencies, different registers are allocated for instructions that are
executed in parallel. Therefore, the number of registers that are employed for each
procedure tends to increase.

[0011] While this method is indispensable for exploiting the parallel execution capacity of
processors, the frequency whereat stack overflows or stack underflows occur
increases as the number of registers used rises. And since the cost of a stack overflow
or a stack underflow is generally very high, the excessively frequent occurrence of
stack overflows or of stack underflows will greatly degrade the execution
performance of a program.

[0012] As is described above, conventionally, when a register stack is employed, the
number of registers designated by the alloc instruction is statically employed for a
specific procedure.

[0013] However, when another procedure is called by a specific procedure, not all of the
allocated registers are necessarily in use. In this case, even though there are unused
registers in the stack, new registers are allocated and stack resources are wasted.

Summary of Invention 

[0014] It is a feature of the present invention to minimize the occurrence of stack
overflows and stack underflows in the generation of program code for processor
architectures employing register stacks, so as to prevent performance deterioration,
during the execution of a program, due to the occurrence of stack overflows or stack
underflows.

[0015] It is another feature of the present invention to provide a method that ensures,
during the generation of program code for a processor architecture that employs a
register stack, that at the execution step of a procedure stack registers will not be
underused, and will be employed efficiently.

[0016] To achieve the above features, according to the present invention, a compiling
method for converting into object code a program written in source code includes the
steps of allocating registers for a program to be compiled and generating object code
based on the register allocation. The step of allocating registers includes the steps of
allocating logical registers for instructions in the program, and performing mapping
between the logical registers and physical registers, so that the physical registers that
are live at a procedure call in the program to be compiled are allocated from the
bottom of the register stack.

[0017] According to the present invention, a code generation method for generating code
for a program that controls a computer includes the steps of: generating code while
confirming that registers are allocated for a predetermined instruction; and, upon the
calling of a procedure, so long as there is a vacancy in operation resources, copying
the registers residing in the register stack to free registers located at the bottom of
the register stack.

[0018] According to the present invention, a method, for employing a stack register when
a processor with a register stack executes a program, comprises the steps of: when a
different procedure is called in a predetermined procedure, reallocating registers that
are allocated for the execution of the predetermined procedure and that are live when
the different procedure is called, and calling the different procedure, while the
registers that are live when the different procedure is called are maintained in the
stack, and as many registers as possible that are not live are released; and
upon the return from the different procedure, restoring the register image to the state
immediately before the reallocation.

[0019] According to the invention, a method, for employing a stack register when a
program is executed by a processor with a register stack, comprises the steps of: each
time a procedure is called, packing and allocating existing logical registers;
performing the procedure; and restoring the register image to the state before the
packing.

[0020] Further, according to the present invention, a program can be provided that
permits a computer to perform processes corresponding to the steps of the compiling
method, the code generation method and the stack register employment method. This
program can be distributed by being stored on a magnetic disk, on an optical disk, in
a semiconductor memory or on another storage medium, or by being transmitted via a
network from the storage device of a program transmission apparatus connected to the
network.

[0021] Furthermore, according to the present invention, a compiler for converting into
machine language code the source code of a program written in a programming 
language comprises: a register allocator, for allocating registers for instructions in the 
program to be compiled; and a code generator, for generating object code based on 
the register allocation process performed by the register allocator, wherein the 
register allocator allocates logical registers for instructions in the program to be 
compiled, and allocates, to physical registers, the logical registers that are allocated to 
the instructions of the program, so that the physical registers that are live at a 
procedure call in the program to be compiled are allocated from the bottom of the 
register stack. 

[0022] In addition, according to the invention, a compiler for converting into machine
language code the source code of a program written in a programming language
comprises: a register allocator, for allocating registers for instructions in the program
to be compiled; and a code generator, for generating object code based on the
register allocation process performed by the register allocator, wherein the code
generator generates code while confirming that registers are allocated for
predetermined instructions, and wherein, upon a procedure being called, the code
generator, so long as there is a vacancy in operation resources, copies the registers
residing in the register stack to free registers that are located at the bottom of the
register stack.

[0023] According to the present invention, a computer having the following structure is 
provided. The computer comprises: input means, for entering source code of a 
program; and a compiler, for compiling the source code and converting the compiled 
code into machine language code, wherein, before a different procedure is called in a 
predetermined procedure of a program to be compiled, the compiler generates code 
for reallocating registers that are allocated for the execution of the predetermined 
procedure and that are live when the different procedure is called, so that only those
registers remain in the stack, and generates code for restoring the register image, upon
the return from the different procedure, to the state immediately before the 
reallocation. 

[0024] Various other objects, features, and attendant advantages of the present invention 
will become more fully appreciated as the same becomes better understood when 
considered in conjunction with the accompanying drawings, in which like reference 
characters designate the same or similar parts throughout the several views. 

Brief Description of Drawings 

[0025] Fig. 1 is a diagram for explaining the general configuration of a compiler 
according to one embodiment of the present invention. 

[0026] Fig. 2 is a diagram showing existing intervals of logical registers when three 
procedure calls occur for a procedure wherein nine registers are allocated. 

[0027] Fig. 3 is a diagram showing the state wherein the logical registers in Fig. 2 are 
sorted based on a method according to the embodiment. 

[0028] Fig. 4 is a diagram showing the state wherein the logical registers in Fig. 3 are
sorted based on another method according to the embodiment.

[0029] Fig. 5 is a flowchart for explaining a register allocation algorithm for a register 
allocator according to the embodiment. 

[0030] Fig. 6 is a flowchart for explaining the processing according to the embodiment 
wherein a code generator re-allocates stack registers by using register allocation 
results obtained by the register allocator. 

[0031] Fig. 7 is a flowchart for explaining another register allocation algorithm for the 
register allocator according to the embodiment. 

[0032] Fig. 8 is a flowchart for explaining a code generation algorithm for a code 
generator according to the embodiment. 

Detailed Description 

[0033] In the mapping step, allocation is performed so that logical registers that are live
across more procedure calls are allocated first, at the bottom of the stack.

[0034] Also in the mapping step, allocation is performed so that logical registers that are
live across a procedure call at which fewer logical registers are live at the same time
are allocated first, at the bottom of the stack.

[0035] The step of reallocating the registers and calling the different procedure includes
the step of sorting and reallocating, from the bottom of the register stack, the
registers that are live when the different procedure is called.

[0036] The register allocator allocates the logical registers and the physical registers first
for an important portion of the program to be compiled. For a less important
portion of the program, the code generator generates compensation code for the
allocation of the logical registers and of the physical registers performed for the
important portion.

[0037] The importance level of a portion of the program can be determined by
referring to the execution frequency of the pertinent portion when the program is
executed. That is, logical registers and physical registers are allocated first for the
portion having the highest execution frequency, which is therefore regarded as the
important portion.



[0038] The preferred embodiment of the present invention will now be described in detail 
while referring to the accompanying drawings. 

[0039] Fig. 1 is a diagram for explaining the general configuration of a compiler 
according to the embodiment. 

[0040] A compiler 10 in Fig. 1 receives, from input means 20, a program (source code) to
be compiled, compiles the program, and generates and outputs object code written in
a machine language.

[0041] In Fig. 1, the compiler 10 of this embodiment comprises: a register allocator 11
for allocating registers for instructions in a program; and a code generator 12 for
converting into object code source code for which register allocation has been
performed.



[0042] The register allocator 11 performs mapping between variables and logical
registers using a coloring method, and performs further mapping between the logical
registers and physical registers in order to perform register allocation. In this
embodiment, the register mapping is performed using a method, which will be
explained later, that is especially effective for ensuring the efficient use of a register
stack.

[0043] The code generator 12 generates object code corresponding to the source code to
be compiled. At this time, based on the register mapping results obtained by the
register allocator 11, an "alloc" instruction for initiating the use of a register stack and
an instruction for resetting an argument are generated. Further, in this embodiment a
method, which will be described later, is employed to generate code while taking into
account the efficient use of the register stack.

[0044] The compiler 10 in Fig. 1 can be implemented by a personal computer, a
workstation or another computer, and the components shown in Fig. 1 are virtual software
blocks implemented by a CPU that is controlled by a computer program. The computer
program that controls the CPU can be provided by being stored on a storage medium,
such as a CD-ROM or a floppy disk, or by being transmitted via a network.






[0045] The components of the compiler 10 in Fig. 1 are related to the characteristic
functions of the invention. Although not shown, the compiler 10 actually comprises
general compiling process functions, such as lexical analysis, syntax analysis and
input program optimization.

[0046] The input means 20 in Fig. 1 can be implemented by a reader for reading a
program (source code) from a predetermined storage device, or a network interface
for receiving a program (source code) from another computer via a network.

[0047] It is assumed that the object code compiled by the compiler 10 in this
embodiment is executed by a CPU having a CPU architecture that employs a register
stack, such as the IA-64 microprocessor. Therefore, for each procedure, the number
of registers to be used is designated by an alloc instruction, and the registers are
obtained from among the available physical registers. When the task performed by the
called procedure is completed, the registers allocated for the procedure are released.

[0048] Even if the parallel-oriented register allocator 11 requires many registers, not all
of the registers initially obtained need to be held continuously, regardless of whether a
procedure is called. As a consequence, in this embodiment the allocation of registers
is performed at a finer granularity.

[0049] Specifically, when a certain procedure issues a call to another procedure, those
registers allocated by the calling procedure that are in use when the call to the other
procedure is issued (hereinafter this state is described as being alive) are re-packed
and re-allocated in the register stack, and the other procedure is then called. In this
process, since the registers allocated for the original procedure that are not in use
when the other procedure is called (hereinafter this state is described as being dead)
are released, the count of the registers allocated for the original procedure is reduced
by a number equivalent to that of the released registers.



[0050] At the time control returns from the called procedure, the register image is
restored to its original state. This is done because the maximum number of
registers allocated by the original alloc instruction is required by the original
procedure that issued the call to the other procedure. That is, in this embodiment,
only those registers that are required at the time of the call are retained, while the
remainder are released, and when control returns from the called procedure to the
original procedure, the register image must be restored to its original state.

[0051] As is described above, since the packing of the stack register is performed each 
time a procedure is called, currently dead registers can be effectively utilized, so that 
the frequency whereat stack overflows occur can be reduced. 

[0052] However, when the allocation of the registers is precisely performed each time a 
procedure is called, execution time may be increased by the register shifting that is 
performed each time the registers are packed. 

[0053] Therefore, in this embodiment, the following two methods are proposed for
performing the packing of registers while avoiding the deterioration of code quality
that can accompany the shifting of registers.



[0054] A first method is a register allocation method that takes register packing into
account (methods (1) and (2), which will be described later). According to this
method, registers that are live when a procedure is to be called are allocated at, and
moved to, the bottom of the stack in advance. The processing for this method is
performed by the register allocator 11 of the compiler 10 in Fig. 1.

[0055] A second method is a register packing method performed during code generation
each time a procedure is called (method (3), which will be described later). The
processing for this method is performed by the code generator 12 of the compiler 10
in Fig. 1.

[0056] For the first method, wherefor the register allocator 11 performs the register
packing, the code generator 12 performs a normal code generation process based on
the register allocation performed by the register allocator 11. For the second method,
wherefor the code generator 12 performs the register packing, the register allocator
11 performs only normal register allocation, and passes control to the
code generator 12.

[0057] The individual methods will now be described in detail. 

[0058] (1) Method for performing register allocation while taking register packing into
account.



[0059] According to this method, after the mapping of a variable and a logical register is
determined by employing a method such as graph coloring, the mapping of a logical
register and a physical register is determined while taking into account the register
packing that is performed at the time a procedure is called.

[0060] According to this method, the priority ordering of logical registers is determined 
based on the following two strategies, and in accordance with this ordering, physical 
registers are allocated from the bottom of the register stack. 

[0061] Strategy 1: A higher priority is given to a logical register whose existing interval
(live range) overlaps a larger number of procedure calls.

[0062] Strategy 2: A higher priority is given to a logical register whose existing interval
overlaps procedure calls at which a smaller number of logical registers are alive at the
same time.

[0063] These two strategies will now be described while referring to Figs. 2 to 4.

[0064] Fig. 2 is a diagram showing the existing intervals of nine logical registers, a to i, at
the times whereat three procedure calls (call [1], call [2] and call [3]) occur for the
procedure to which the registers are assigned. Fig. 3 is a diagram showing the state
wherein the logical registers in Fig. 2 are sorted based on strategy 1. And Fig. 4 is a
diagram showing the state wherein the logical registers in Fig. 3 are further sorted
based on strategy 2.

[0065] As is shown in Fig. 2, when the registers are allocated, from the bottom of the 
stack in the order a to i, three registers f, g and h are dead at the time of procedure 
call [1]. Similarly, five registers a, d, e, g and h are dead at the time of procedure call 
[2], and two registers c and h are dead at the time of procedure call [3]. That is, these 
registers are wasted. 
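
The counts given for Fig. 2 can be checked with a short calculation. This is a minimal sketch in which the function name frame_usage and the data layout are hypothetical; the liveness data is taken from the description of Fig. 2 above, and a dead register is counted as wasted when it lies below the topmost live register and so cannot be released.

```python
# Stack order of Fig. 2 (bottom to top) and the registers dead at each call.
ORDER_FIG2 = "abcdefghi"
DEAD_AT_CALL = {1: set("fgh"), 2: set("adegh"), 3: set("ch")}


def frame_usage(order, dead):
    """Return (registers that must stay allocated, dead registers among them)."""
    live_positions = [i for i, reg in enumerate(order) if reg not in dead]
    if not live_positions:
        return 0, 0
    top = max(live_positions) + 1          # frame must reach the topmost live register
    wasted = sum(1 for reg in order[:top] if reg in dead)
    return top, wasted


for call, dead in DEAD_AT_CALL.items():
    print(call, frame_usage(ORDER_FIG2, dead))
# 1 (9, 3)
# 2 (9, 5)
# 3 (9, 2)
```

Because register i is alive at every call and sits at the top of the stack in this order, all nine registers must stay allocated at each call, and ten dead register slots in total are held across the three calls.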

[0066] Thus, by using strategy 1, the logical registers whose existing intervals overlap
many procedure calls are allocated first and moved to the bottom of the register
stack. In Fig. 3, the state wherein the registers are thus sorted is shown. In Fig. 3,
the logical registers whose existing intervals overlap the same number of procedure
calls are arranged in dictionary order.

[0067] In Fig. 3, the logical registers b and i, which are alive at all three procedure
calls, are moved to the bottom of the stack. As a result, three upper registers are
empty at call [1], two upper registers are empty at call [2], and one upper register is
empty at call [3]. Thus, when the procedures requested by these procedure calls are
executed, these registers are released and the re-allocation of registers is enabled.

[0068] However, even in this state, three registers a, d and e are dead at procedure call 
[2], and one register c is dead at procedure call [3], so that these registers are wasted. 

[0069] Therefore, by using strategy 2, the logical registers whose existing intervals
overlap procedure calls at which only a small number of registers are simultaneously
alive are allocated first and moved to the bottom of the register stack. In Figs. 2 and
3, among registers a, c, d, e and f, each of whose existing intervals overlaps two
procedure calls, the number of such logical registers that are alive at the time of call
[2] is two, which is smaller than the number of registers at the time of the other calls
(four registers in both cases). Therefore, higher priorities are awarded to the logical
register c and the logical register f, which are alive at call [2]. In Fig. 4, the state
wherein the registers are sorted in this manner is shown. It should be noted that in
Fig. 4 the logical registers whose existing intervals overlap the same procedure calls
are arranged in dictionary order.

[0070] In Fig. 4, as is described above, the logical registers c and f are moved toward the
bottom of the stack so that they follow the logical registers b and i. As a result, the
number of upper registers that are empty at call [1] is reduced by one, to two,
while five registers are empty at call [2], and one register is empty at call [3].
Therefore, when a procedure for which execution is requested by the procedure calls
is executed, these empty registers are released and re-allocation is enabled.

[0071] Through the above processing, the number of registers that remain dead as a
result of calls [1], [2] and [3] is two. Compared with the state in Fig. 2, it is apparent
that eight more registers are effectively employed, and optimal results are obtained.

[0072] It should be noted that the sorting using strategy 2 is performed with lower
weighting than is the sorting using strategy 1. Therefore, in the sorting using
strategy 2, the order that is determined using strategy 1 (in accordance with the
number of procedure calls) is not changed. For example, even if the logical register g
were alive at call [2] rather than at call [3], in the state in Fig. 4 the logical
register g would not be moved and placed closer to the bottom than the logical register e.

[0073] Fig. 5 is a flowchart for explaining the register allocation algorithm of the register
allocator 11 using the method of the invention.

[0074] As is shown in Fig. 5, the register allocator 11 first analyzes the existing interval
of each variable (step 501). Then, based on this analysis, the register allocator 11
performs the normal register allocation for logical registers (step 502).

[0075] Following this, the register allocator 11 counts, for each logical register, the
number of procedure calls that are present in the existing interval of the register
(step 503). Further, the register allocator 11 counts, for each procedure call, the
number of logical registers for which existing intervals overlap (step 504). The
processes at steps 503 and 504 may be performed in the reverse order.



[0076] Thereafter, the register allocator 11 sorts the logical registers in descending
order of the number, counted for each logical register, of procedure calls that its
existing interval overlaps (step 505). If the number of procedure calls is the same,
the logical registers are sorted, for example, in dictionary order.

[0077] Next, the register allocator 11 sorts the logical registers beginning with those for
which the number of logical registers present at the time of the procedure call is the
smallest (step 506). Specifically, for each logical register, all the procedure calls that
its existing interval overlaps are detected, and an average value is calculated for
the number of logical registers, the existing intervals of which are present in these
procedure calls. When the same number of logical registers are simultaneously alive
during procedure calls, the logical registers are sorted, for example, in dictionary
order.



[0078] Finally, based on the sorting results obtained at steps 505 and 506, the register
allocator 11 maps the logical registers onto stack registers, which are physical
registers, from the bottom of the register stack (step 507).
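
Steps 503 to 507 can be sketched as a single sort. The following is a minimal illustration, not the claimed implementation: the names stack_order and dead_below_top are hypothetical, the liveness data is that of Fig. 2, and the average at step 506 is assumed to be taken over the per-call counts obtained at step 504. For this data that assumption gives the same order as counting only registers with an equal number of calls.

```python
from statistics import mean

REGISTERS = "abcdefghi"
# Logical registers alive at each procedure call, as described for Fig. 2.
LIVE_AT_CALL = {1: set("abcdei"), 2: set("bcfi"), 3: set("abdefgi")}


def stack_order(registers, live_at_call):
    """Return the logical registers ordered from the bottom of the register stack."""
    # Step 503: the procedure calls that each register's existing interval overlaps.
    calls_of = {r: [c for c, live in live_at_call.items() if r in live]
                for r in registers}
    # Step 504: the number of registers alive at each procedure call.
    live_count = {c: len(live) for c, live in live_at_call.items()}

    def priority(reg):
        calls = calls_of[reg]
        crowding = mean(live_count[c] for c in calls) if calls else 0.0
        # Step 505 outweighs step 506; dictionary order breaks remaining ties.
        return (-len(calls), crowding, reg)

    return sorted(registers, key=priority)          # step 507: map in this order


def dead_below_top(order, live):
    """Count dead registers that lie below the topmost live register."""
    top = max(i for i, reg in enumerate(order) if reg in live) + 1
    return sum(1 for reg in order[:top] if reg not in live)


order = stack_order(REGISTERS, LIVE_AT_CALL)
print("".join(order))                                             # bicfadegh
print(sum(dead_below_top(order, live) for live in LIVE_AT_CALL.values()))  # 2
```

The resulting order matches the state described for Fig. 4, and the two dead registers that remain (f at call [1] and c at call [3]) agree with the count given above.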

[0079] After the register allocator 11 has completed the register allocation, the code
generator 12 generates code actually used for implementing the efficient use of the
stack register.

[0080] Fig. 6 is a flowchart for explaining the processing performed by the code
generator 12 for reallocating stack registers using the register allocation results. It
should be noted that normal code generation is performed by the code generator 12
before the processing in Fig. 6 is begun, and that after this generation the code
generator 12 performs the processing in Fig. 6 for each procedure in the program.



[0081] In Fig. 6, first, from among the logical registers that are alive at a target procedure
call, the code generator 12 detects the logical register nearest the top of the register
stack (steps 601 and 602). Then, while it is assumed that the location of the detected
logical register in the register stack is the new stack top of the register stack to be
reallocated, a new frame size is calculated (step 603).

[0082] Subsequently, the code generator 12 re-sets the arguments based on the new
frame size obtained at step 603 (step 604). Specifically, first, an argument is placed
at a vacant argument destination, and then a further argument is placed at a
destination that is vacant. This process is repeated until there are no more vacant
argument destinations. Further, an argument that remains unprocessed is placed by
using a working register.

[0083] When the re-setting of arguments has been completed, the code generator
12 generates an alloc instruction to generate a new frame, a procedure call instruction
(call instruction), and an alloc instruction for restoring the frame to its original state
(step 605).
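
The processing for one procedure call (steps 602 to 605) can be sketched as follows. This is an illustration under stated assumptions only: the functions sequence_moves and emit_call, the slot-index representation of registers, the working register name "tmp", and the tuple form of the pseudo-instructions are all hypothetical and do not reflect actual IA-64 instruction syntax. The argument handling at step 604 is interpreted here as ordering a set of parallel register moves.

```python
def sequence_moves(moves, scratch):
    """Order parallel moves {dst: src} so that no source is overwritten before use."""
    pending = {dst: src for dst, src in moves.items() if dst != src}
    code = []
    while pending:
        sources = set(pending.values())
        vacant = [dst for dst in pending if dst not in sources]
        if vacant:
            dst = vacant[0]                      # destination no pending move reads
            code.append(("mov", dst, pending.pop(dst)))
        else:
            # Only cycles remain: park one source in the working register.
            _, src = next(iter(pending.items()))
            code.append(("mov", scratch, src))
            for dst in pending:
                if pending[dst] == src:
                    pending[dst] = scratch
    return code


def emit_call(order, live, arg_slots, frame_size, callee, scratch="tmp"):
    """Return pseudo-instructions that pack the frame around one procedure call."""
    # Steps 602-603: the topmost live register fixes the new stack top.
    top = max(i for i, reg in enumerate(order) if reg in live) + 1
    new_size = top + len(arg_slots)
    # Step 604: place each argument directly above the new stack top.
    moves = {top + k: src for k, src in enumerate(arg_slots)}
    # Step 605: shrink the frame, call, then restore the original frame.
    return sequence_moves(moves, scratch) + [
        ("alloc", new_size), ("call", callee), ("alloc", frame_size)]


# Fig. 4 order at call [2]: only b, i, c and f are alive, so the new top is slot 4.
code = emit_call("bicfadegh", set("bcfi"), arg_slots=[4, 2],
                 frame_size=11, callee="callee")
print(code)
# [('mov', 5, 2), ('alloc', 6), ('call', 'callee'), ('alloc', 11)]
```

In this example the first argument already sits in slot 4, so only one move is generated, and the frame shrinks from eleven registers to six for the duration of the call.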

[0084] When the processes at steps 602 to 605 have been performed for all the
procedures in the program, the processing is terminated (step 601).






[0085] (2) Method using the locality of a program.

[0086] This method (2) uses the method (1) as a basis and applies it locally to parts of
a program.

[0087] A program is normally constituted by multiple parts, the importance of which
may differ from part to part. Therefore, according to this method, instead of
applying the method (1) to the entire program, register packing is performed first for
the important parts of the program.

[0088] Specifically, a hot spot that is very important is designated by using profile
information, and registers are allocated so that, at a procedure call present along the
path of the hot spot, the maximum packing of the registers is performed. Furthermore,
compensation code for the packing performed for a path that constitutes a hot spot is
output for a path that is not a hot spot.



[0089] The importance levels of the parts of the program are determined relatively. That
is, packing is performed so that the stack registers can be employed the most
efficiently for the most important part, and for a part whose importance level is
lower, compensation code for the packing of the more important part is
generated.

[0090] Fig. 7 is a flowchart for explaining a register allocation algorithm for the register
allocator 11 using the method of the invention. This processing is performed using the
algorithm in the flowchart in Fig. 7 so that register movement is not generated at a
hot spot.

[0091] As is shown in Fig. 7, first, the register allocator 11 of the compiler 10 divides a 
program into traces (step 701), and sorts the traces based on their importance levels, 
such as their execution frequencies (step 702). 

[0092] Then, to prepare a pad for register shifting, the register allocator 11 divides a less
important trace as needed (step 703). Specifically, when there is an outflow from a
less important trace to a more important trace, the less important trace is divided at
the outflow point. Further, when there is an inflow from a more important trace to a
less important trace, the less important trace is divided at the inflow point.

[0093] The register allocator 11 sequentially performs the following process for each of 
the thus sorted traces (i.e., in the descending order of their importance levels) (steps 
704 and 705). 

[0094] First, for a target trace, the logical registers are sorted using method (1) (step 
706). 

[0095] Then, a check is performed to determine whether the mapping of the logical 
register and the physical register for a different trace has been transmitted to the head 
of the target trace. When the mapping has been transmitted, register shifting is generated 
at the head of the target trace in order to reconcile the transmitted mapping with the 
sorting results at step 706 (steps 707 and 708). 

[0096] A check is then performed to determine whether there is an outflow from the 
target trace to a predetermined trace (step 709), and when there is no outflow, the 
processing is terminated. When there is an outflow, a check is further performed to 
ascertain whether the mapping of the logical register and the physical register at the 
head of a destination trace has been determined (steps 709 and 710). 

[0097] When the mapping of the logical register and the physical register at the head of 
the destination trace has not yet been determined, the register allocator 11 transmits, 
to the head of the destination trace, the mapping of the logical register and the 
physical register for the target trace (steps 710 and 711). Since the traces are 
employed in the descending order of their importance levels, it is ensured that the 
importance level of the destination trace is not higher than that of the target trace. 

[0098] When the mapping of the logical register and the physical register at the head of 
the destination trace has already been determined, it is assumed that the importance 
level of the destination trace is higher than that of the target trace. In this case, the 
mapping of the logical register and the physical register has already been determined 
only for a trace that is the outflow destination from the end of the target trace. This 
is because, at step 703, if there is an outflow from a less important trace to a more 
important trace, the less important trace is divided at the outflow point, so that it is 
ensured that the mapping of the logical register and the physical register, at the head 
of a trace that is the outflow destination from a specific trace, has not yet been 
determined. 



[0099] Therefore, register shifting is generated at the end of the target trace in order 
to reconcile the sorting results for the target trace at step 706 with the mapping of the 
logical register and the physical register for the outflow destination trace (steps 710 
and 712). 

[0100] When the processes at steps 706 to 713 have been completed for all the traces 
(step 704), the register allocation process performed by the register allocator 11 is 
terminated. 
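
The flow of steps 702 to 712 can be sketched as follows. This is a simplified sketch under stated assumptions, not the implementation of the embodiment: a mapping is modeled as a dict from a logical register name to a physical register number, `pack` is a hypothetical callback standing in for the sorting by method (1) at step 706, and `shifts` lists register moves without ordering them to avoid overwriting a value that is still needed. The register numbers in the example are arbitrary.

```python
def shifts(src_map, dst_map):
    """Moves (from_phys, to_phys) that turn src_map into dst_map (logical -> physical)."""
    return [(src_map[v], dst_map[v]) for v in sorted(dst_map)
            if v in src_map and src_map[v] != dst_map[v]]


def allocate(traces, successors, pack):
    """traces: dict trace name -> importance level.
    successors: dict trace name -> list of outflow destination traces.
    pack: hypothetical callback returning the packed mapping of a trace (step 706).
    Returns a list of (trace, "head" or "tail", moves) register-shift requests.
    """
    head = {}   # mapping fixed at the head of each trace once it is determined
    code = []
    # Steps 702, 704, 705: visit traces in descending order of importance.
    for t in sorted(traces, key=traces.get, reverse=True):
        own = pack(t)                                    # step 706
        if t in head:                                    # step 707: mapping transmitted?
            moves = shifts(head[t], own)
            if moves:
                code.append((t, "head", moves))          # step 708
        else:
            head[t] = own    # assumption: an untouched head simply uses the trace's mapping
        for d in successors.get(t, []):                  # step 709
            if d not in head:                            # step 710
                head[d] = own                            # step 711: transmit the mapping
            else:
                moves = shifts(own, head[d])
                if moves:
                    code.append((t, "tail", moves))      # step 712
    return code


maps = {"hot": {"a": 32, "b": 33}, "cold": {"a": 33, "b": 32}}
flow = {"hot": ["cold"], "cold": ["hot"]}
for entry in allocate({"cold": 1, "hot": 100}, flow, maps.get):
    print(entry)
```

In this example all register shifting is attached to the head and the end of the trace named "cold"; no moves are requested for the trace named "hot", which corresponds to the aim that register movement is not generated at a hot spot.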

[0101] Following this, the code generator 12 employs the register allocation results 
obtained by the register allocator 11 to generate code to actually implement the 
efficient use of the stack register. Since the code generation process is performed by 
the code generator 12 in the same manner as the process using the method (1) in the 
flowchart in Fig. 6, no detailed explanation for it will be given. 

[0102] Through the above processing, the stack register can be used efficiently, with 
preference given to an important part (e.g., a frequently executed part) of the program. 
Therefore, although there are some parts of a program for which the use of a stack 
register is less efficient, the execution performance of the entire program can be 
improved. 

[0103] The processes using the methods (1) and (2) are performed as a part of the 
register allocation process by the register allocator 11. Since the code generator 12 
generates code in accordance with the allocation results obtained while taking the 
register packing into account, the program execution time will not be increased due to 
the register shifting that occurs during the packing of the registers. 

[0104] (3) Method for performing register packing for each procedure call during code 
generation. 



[0105] According to this method, code is generated by tracking registers to determine 
whether they are alive, and so long as there is a vacancy in operation resources at a 
procedure call, the contents of a register that is alive and is present near the top of 
the stack are copied to a dead register that is present near the bottom of the stack. 

[0106] While a processor that can process instructions in parallel can simultaneously 
execute multiple instructions, it does not always execute as many instructions as its 
execution capability (operation resources) allows, because of the need for 
synchronization of the instructions. Therefore, according to this method, when there 
is a vacancy in operation resources, register packing is performed using the empty 
resource. 

[0107] Fig. 8 is a flowchart for explaining the code generation algorithm of the code 
generator 12 using the method (3). This method is carried out by the algorithm in the 
flowchart in Fig. 8. 

[0108] As is shown in Fig. 8, first, when an instruction scheduling window (hereinafter 
referred to simply as a window, as in Fig. 8) is flushed upon the detection of a 
procedure call, the code generator 12 of the compiler 10 searches the window and 
determines whether an alloc instruction can be generated, and how many copy 
instructions can be generated at that time (step 801). 

[0109] Then, the code generator 12 detects stack registers that are dead at the time of a 
procedure call, and sorts them, beginning with the one at the bottom of the stack 
(step 802). 



[0110] Next, the code generator 12 detects an argument that is not set in the window, 
and obtains a copy instruction slot to re-set the argument (step 803). When a copy 
instruction slot cannot be obtained (step 804), the processing performed using the 
method (3) is terminated. 

[0111] Following this, the code generator 12 detects stack registers that are alive at the 
time of the procedure call, and sorts them beginning with the one at the top of the 
stack (step 805). The sorted stack registers are sequentially copied to dead stack 
registers so long as a copy instruction slot is available (step 806). The dead stack 
registers are used sequentially, beginning with the one at the bottom of the stack. 
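
The pairing of live and dead stack registers in steps 802, 805 and 806 can be sketched as follows. This is an illustrative sketch only: the function name `plan_packing`, the use of plain integers as stack-register positions, and the guard that stops when the next dead register lies above the next live register are assumptions, not details stated in the description.

```python
def plan_packing(live, dead, copy_slots):
    """Plan the register copies for one procedure call.

    live, dead: positions of the stack registers that are alive / dead at the call.
    copy_slots: number of free copy-instruction slots found in the window (step 801).
    Returns (copies, new_top), where copies is a list of (source, destination) pairs
    and new_top is the highest position still holding a live value afterwards.
    """
    dead_from_bottom = sorted(dead)               # step 802: dead registers, bottom first
    live_from_top = sorted(live, reverse=True)    # step 805: live registers, top first
    copies = []
    occupied = set(live)
    for src, dst in zip(live_from_top, dead_from_bottom):   # step 806
        if len(copies) == copy_slots:
            break    # no copy-instruction slot is left
        if dst > src:
            break    # assumed guard: the copy would not move the value toward the bottom
        copies.append((src, dst))
        occupied.discard(src)
        occupied.add(dst)
    new_top = max(occupied) if occupied else None
    return copies, new_top


copies, new_top = plan_packing(live={32, 35, 37}, dead={33, 34, 36}, copy_slots=2)
print(copies, new_top)   # [(37, 33), (35, 34)] 34
```

With two free slots, the values at positions 37 and 35 move down to 33 and 34, so the highest live position drops from 37 to 34 and the registers above it can be released by an alloc instruction with a reduced register stack. With no free slot, nothing is moved and the frame keeps its size.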



[0112] At the obtained location for the generation of an alloc instruction, the code 
generator 12 generates an alloc instruction having a reduced register stack (step 807), 
and renames the argument in accordance with the reduction results (step 808). For an 
argument that is not set in the window, a copy instruction for re-setting the argument 
is generated at the location obtained at step 803 (step 809). 

[0113] Finally, at the end of the procedure call, the code generator 12 generates a copy 
instruction and an alloc instruction in order to restore, to its original state, the 
register image that has been moved as is described above (step 810). 

[0114] Through the above processing, the contents of a register can be moved toward 
the bottom of the register stack and the register at the top released, but only when 
there is a vacancy in operation resources of the processor that is adequate for the 
performance of register packing. Therefore, the efficient use of stack registers is 
achieved using register packing, and in addition, an increase in the program execution 
time due to the register movement that occurs during register packing can be 
prevented. 

[0115] As is described above, according to the present invention, in the generation of 
program code for a processor having an architecture that includes a register stack, the 
occurrence of stack overflows or stack underflows can be suppressed, and 
deterioration of the execution performance of the program can be prevented. 

[0116] Further, according to the present invention, during program code generation, the 
occurrence of a register that is not used upon the execution of a procedure can be 
suppressed, and stack registers can be efficiently employed. 

[0117] It is to be understood that the provided illustrative examples are by no means 
exhaustive of the many possible uses for my invention. 

[0118] From the foregoing description, one skilled in the art can easily ascertain the 
essential characteristics of this invention and, without departing from the spirit and 
scope thereof, can make various changes and modifications of the invention to adapt 
it to various usages and conditions. 

[0119] It is to be understood that the present invention is not limited to the sole 
embodiment described above, but encompasses any and all embodiments within the 
scope of the following claims: 


